Cassandra








Install Cassandra on Ubuntu/Debian

Cassandra is a highly scalable, eventually consistent, distributed, structured key-value store. Cassandra brings together Dynamo’s fully distributed design and Bigtable’s ColumnFamily-based data model.

In a cluster, Cassandra nodes exchange information about one another using a mechanism called Gossip. The nodes in a cluster need to know about one another, and the nodes designated as “seeds” are the centre of this communication mechanism. It’s customary to pick a small number of relatively stable nodes to serve as your seeds. Make sure that each seed also knows of at least one other seed. Having two seed nodes is preferred.
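
The rule that every seed should know of at least one other seed can be checked before you roll a configuration out. The following is a minimal sketch of such a check; the class SeedListCheck, its method and the example addresses are hypothetical and are not part of Cassandra.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Cassandra: sanity-checks a planned seed list.
final class SeedListCheck {

    /** Returns true if the seed list configured on the node 'self' names at least one other node. */
    static boolean knowsAnotherSeed(String self, List<String> seeds) {
        Set<String> others = new HashSet<>(seeds);
        others.remove(self);
        return !others.isEmpty();
    }

    public static void main(String[] args) {
        // Example addresses only; substitute the addresses of your own seed nodes.
        List<String> seeds = Arrays.asList("10.0.0.1", "10.0.0.2");
        for (String seed : seeds) {
            System.out.println(seed + " knows another seed: " + knowsAnotherSeed(seed, seeds));
        }
    }
}
```

With the two example seeds, each node finds the other one in the list, so both checks succeed. A list that contains only the node itself would fail the check.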

Let’s have a look at how we can bring up a cluster with Cassandra 0.7.x on Ubuntu 10.04.

First of all you have to install Java (a JDK). As that is out of scope for this discussion, please do it on your own, and let’s start with Cassandra.

1) Add the following repositories to your apt sources list

vi /etc/apt/sources.list


deb http://www.apache.org/dist/cassandra/debian 07x main
deb-src http://www.apache.org/dist/cassandra/debian 07x main

2) Import the following keys and add them to apt-key

gpg --keyserver keyserver.ubuntu.com --recv-keys 4BD736A82B5C1B00

gpg --export --armor 4BD736A82B5C1B00 | sudo apt-key add -

gpg --keyserver keyserver.ubuntu.com --recv-keys F758CE318D77295D

gpg --export --armor F758CE318D77295D | sudo apt-key add -

3) Update the package index and make sure that no errors are reported when accessing the packages.

sudo apt-get update

4) Install Cassandra on every node (machine) that will be part of the cluster.

sudo apt-get install cassandra --yes

5) Start the Cassandra server with the following command:

sudo /etc/init.d/cassandra start



Using the Cassandra CLI

The directory CASSANDRA_HOME/bin contains a startup script for launching the CLI. Running CASSANDRA_HOME/bin/cassandra-cli will display a usage list of the valid arguments and options.

To start the CLI and connect to a particular Cassandra instance, launch the script together with the -host and -port arguments. In this example, we connect to the default instance, “Test Cluster”:

$ ./cassandra-cli -host localhost -port 9160
Connected to: "Test Cluster" on localhost/9160
Welcome to cassandra CLI.
Type 'help;' or '?' for help. Type 'quit;' or 'exit;' to quit.
[default@unknown]

As the screen output suggests, you can enter a question mark or help; for more information about commands. For detailed help on a specific command, use help <command>;.

Note

For every command or statement you enter into the CLI, make sure you enter a semicolon at the end before hitting the return key. If you forget to do this, the CLI echoes an ellipsis ( . . . ), which indicates that the CLI expects more input, such as a semicolon, or names and values in other cases.

Creating a Keyspace

You can use the Cassandra CLI commands described in this section to create a keyspace. In creating an example keyspace for Twissandra, we will assume a desired replication factor of 1 and implementation of the SimpleStrategy replica placement strategy. For more information on these keyspace options, see Clustering.

Note the single quotes around the string value of placement_strategy:

[default@unknown] create keyspace twissandra
with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy'
and strategy_options = [{replication_factor:1}];

You can verify the creation of a keyspace with the show keyspaces command. The new keyspace is listed along with the system keyspace and any other existing keyspaces.

Creating a Column Family

For this example, we use the CLI to create a users column family in the example Twissandra keyspace. Column metadata is defined for the name of the password column and for its validation class to ensure that UTF8Type is used.

Note the use command to connect to the twissandra keyspace.

[default@unknown] use twissandra;
Authenticated to keyspace: twissandra
[default@twissandra] create column family users with comparator = UTF8Type
... and column_metadata = [{column_name: password, validation_class:
... UTF8Type}];
ade3bc44-236f-11e0-8410-56547f39a44b

Similar commands to create the column families for Twissandra tweets, friends, followers, userline and timeline would look like the following:

[default@twissandra] create column family tweets with comparator = UTF8Type and
column_metadata = [{column_name: body, validation_class:
UTF8Type}, {column_name: username, validation_class: UTF8Type}];
ba95d891-2cb5-11e0-9c0d-e700f669bcfc

[default@twissandra] create column family friends with comparator = UTF8Type;
71f22752-2cb6-11e0-9c0d-e700f669bcfc

[default@twissandra] create column family followers with comparator = UTF8Type;
81067983-2cb6-11e0-9c0d-e700f669bcfc

[default@twissandra] create column family userline with comparator = LongType and
default_validation_class = TimeUUIDType;
276b8544-2cb7-11e0-9c0d-e700f669bcfc

[default@twissandra] create column family timeline with comparator = LongType and
default_validation_class = TimeUUIDType;


Inserting and Retrieving Columns

Though in production scenarios it is more practical to insert columns and column values programmatically, it is possible to use the Cassandra CLI for these operations. The example in this section illustrates using the set and get commands to insert and retrieve some columns in the users column family.

The following commands create and then get a user record for “jsmith.” The record includes a value for the password column we created when we created the column family. Note that the user name “jsmith” is the row key – not a column.

[default@twissandra] set users['jsmith']['password']='ch@ngem3';
Value inserted.
[default@twissandra] get users['jsmith'];
=> (column=password, value=6368406e67656d33, timestamp=1295635612024000)
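
The value in the get output above is the hexadecimal encoding of the bytes that were stored: each pair of hex digits is one byte. As a minimal sketch of that encoding (the class HexValues and its method are hypothetical names, and the sketch assumes the stored bytes are plain ASCII), the following decodes the value back into the original password:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: decodes the hex values that the CLI prints for raw bytes.
final class HexValues {

    /** Converts a hex string such as "74656e" into the ASCII text it encodes. */
    static String decodeHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string must have an even length");
        }
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            // Each pair of hex digits is one byte of the stored value.
            bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(decodeHex("6368406e67656d33")); // prints ch@ngem3
    }
}
```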

Note

For all CLI write and read operations such as these example commands, the consistency level is ONE. Different consistency levels are not available with the CLI, though all levels are available when writing or reading programmatically.

Using Human-Readable Data

Applications on Cassandra may require values in bytes type or other formats that are not naturally human-readable. The CLI allows you to temporarily translate machine-readable data to human-readable formats in order to work with data more easily.

To retrieve a more human-readable value for a particular value such as the UTF8Type password value used above, add as ascii to the get command:

[default@twissandra] get users['jsmith']['password'] as ascii;
=> (column=password, value=ch@ngem3, timestamp=1295635612024000)

Or, for more comprehensive translation of values, you can use the assume command. This command lets you select any one of the following attributes to view as a specified type:

keys (row key)
validator
comparator
sub_comparator (for sub-columns in a column family of type super)

For example, we could issue a command that assumes integer type for the row keys in the users column family. This would render human-readable values like ‘jsmith’ and ‘jbellis’ as numbers; then, if we further assumed the comparator for users as integer, we could take advantage of that sort order to perform range queries (given the appropriate partitioner, and other conditions). The following example displays the users row keys as integers:

[default@twissandra] assume users keys as integer;
Assumption for column family 'users' added successfully.
[default@twissandra] list users limit 2;
-------------------
RowKey: 118070570874734
=> (column=8097880544751088228, value=6368406e67656d33, timestamp=129710078100)
-------------------
RowKey: 29944535281592691
=> (column=8097880544751088228, value=6368406e67656d33, timestamp=129624362200)
2 Rows Returned.
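
The numbers in this listing are the raw key and column-name bytes read as a single big-endian integer. As a rough illustration (KeyAsInteger is a hypothetical class name, and the sketch assumes ASCII keys), the first RowKey shown above is what the six bytes of kbrown look like as an integer, and the column name is the eight bytes of password:

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: shows how a text key looks when its bytes are read as an integer.
final class KeyAsInteger {

    /** Interprets the ASCII bytes of the key as one big-endian integer. */
    static BigInteger asInteger(String key) {
        return new BigInteger(key.getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) {
        System.out.println(asInteger("kbrown"));   // prints 118070570874734
        System.out.println(asInteger("password")); // prints 8097880544751088228
    }
}
```

Longer keys produce larger integers, which is why the resulting sort order differs from the alphabetical order of the text keys.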

You can use both assume and as with any one of these valid type values:

bytes
integer
long
lexicaluuid
timeuuid
utf8
ascii

Setting an Expiring Column

When you set a column in Cassandra, you can optionally set an expiration time, or “time-to-live” (ttl) attribute for it. In this example we will imagine that the user jsmith gets assigned a web session token that lasts for ten days before he needs to log in again. To accomplish this, the column session_token is set with a ttl value of 864000:

[default@twissandra] set users['jsmith']['session_token'] = 'ten' with ttl=864000;
Value inserted.
[default@twissandra] get users['jsmith'] as ascii;
=> (column=password, value=ch@ngem3, timestamp=1295635612024000)
=> (column=session_token, value=74656e, timestamp=1295898172256000, ttl=864000)
Returned 2 results.

After ten days, or 864,000 seconds, have elapsed since the setting of this column, its value will no longer be returned by read operations. Note, however, that the value is not actually deleted from disk until normal Cassandra tombstoning and compaction processes are completed.
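
The ttl value is expressed in seconds, so ten days works out to 10 × 24 × 3600 = 864000. The sketch below shows that arithmetic and a simple liveness check. TtlSketch and its methods are hypothetical, and the check is only a model based on the microsecond write timestamp shown in the CLI output; Cassandra performs the real expiry check itself on the server.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper: illustrates the ttl arithmetic only.
final class TtlSketch {

    /** Converts a lifetime in days into the seconds value passed to "with ttl=". */
    static long ttlSeconds(long days) {
        return TimeUnit.DAYS.toSeconds(days);
    }

    /** Simplified model: a column is readable until write time plus ttl has passed. */
    static boolean isLive(long writeMicros, long ttlSeconds, long nowMicros) {
        long expiresAtMicros = writeMicros + TimeUnit.SECONDS.toMicros(ttlSeconds);
        return nowMicros < expiresAtMicros;
    }

    public static void main(String[] args) {
        long ttl = ttlSeconds(10);
        System.out.println(ttl); // prints 864000

        long written = 1295898172256000L; // timestamp of session_token in the example
        long fiveDaysLater = written + TimeUnit.DAYS.toMicros(5);
        long elevenDaysLater = written + TimeUnit.DAYS.toMicros(11);
        System.out.println(isLive(written, ttl, fiveDaysLater));   // prints true
        System.out.println(isLive(written, ttl, elevenDaysLater)); // prints false
    }
}
```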

Indexing a Column

The CLI can be used to create secondary indexes, or indexes on column values. In this example, we will update the column family users with the new columns state and birth_date, the latter of which will be indexed. Note the index_type specification at the end of the last line of this example command:

[default@twissandra] update column family users with comparator = UTF8Type
... and column_metadata = [{column_name: password, validation_class:UTF8Type},
... {column_name: state, validation_class: UTF8Type},
... {column_name: birth_date, validation_class: LongType, index_type: KEYS}];

Because of the secondary index created for the column birth_date, its values can be queried directly for users born in a given year as follows:

[default@twissandra] get users where birth_date = 1973;
-------------------
RowKey: jsmith
=> (column=birth_date, value=1973, timestamp=1296677866680000)
=> (column=password, value=ch@ngem3, timestamp=1296243370505000)
=> (column=session_token, value=74656e, timestamp=1295898172256000, ttl=864000)
=> (column=state, value=UT, timestamp=1296677364573000)

Using the CLI you can also create a secondary index on an existing column. For example, we could update the users column family to add index_type: KEYS for the state column:

[default@twissandra] update column family users with comparator = UTF8Type
... and column_metadata = [{column_name: password, validation_class:UTF8Type},
... {column_name: state, validation_class: UTF8Type, index_type: KEYS},
... {column_name: birth_date, validation_class: LongType, index_type: KEYS}];

Because of the secondary index created for the state column, its values can be queried directly for users in a given state. Additionally, Cassandra could perform a range query on birth_date now that the state column is indexed, using the state predicate as the primary and filtering on the other with a nested loop:

[default@twissandra] get users where state = 'TX' and birth_date > 1970;
RowKey: jbellis
=> (column=birth_date, value=1975, timestamp=1291333936242000)
=> (column=password, value=ch@ngem3, timestamp=1296243370505000)
=> (column=state, value=TX, timestamp=1291334909266000)
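
The query strategy described above (look up the indexed equality predicate first, then filter the candidates on the other predicate in a loop) can be pictured with a small in-memory model. This is only a sketch of the idea: IndexedQuerySketch, its User class and the sample data are hypothetical, and the code does not reflect how Cassandra stores its indexes internally.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of "indexed lookup first, then filter the candidates".
final class IndexedQuerySketch {

    static final class User {
        final String state;
        final long birthDate;

        User(String state, long birthDate) {
            this.state = state;
            this.birthDate = birthDate;
        }
    }

    /** Sample rows taken from the CLI listings in this article. */
    static Map<String, User> sampleUsers() {
        Map<String, User> users = new LinkedHashMap<>();
        users.put("jbellis", new User("TX", 1975));
        users.put("jsmith", new User("UT", 1973));
        return users;
    }

    /** Builds a toy index: state value -> keys of the rows holding that value. */
    static Map<String, List<String>> indexByState(Map<String, User> users) {
        Map<String, List<String>> index = new LinkedHashMap<>();
        for (Map.Entry<String, User> e : users.entrySet()) {
            index.computeIfAbsent(e.getValue().state, s -> new ArrayList<>()).add(e.getKey());
        }
        return index;
    }

    /** Equivalent of: get users where state = <state> and birth_date > <bornAfter>. */
    static List<String> query(Map<String, User> users, Map<String, List<String>> stateIndex,
                              String state, long bornAfter) {
        List<String> matches = new ArrayList<>();
        List<String> candidates = stateIndex.get(state); // primary: the indexed equality predicate
        if (candidates == null) {
            return matches;
        }
        for (String key : candidates) {                   // secondary: filter each candidate row
            if (users.get(key).birthDate > bornAfter) {
                matches.add(key);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        Map<String, User> users = sampleUsers();
        System.out.println(query(users, indexByState(users), "TX", 1970)); // prints [jbellis]
    }
}
```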

Retrieving Multiple Rows and Columns

Retrieving multiple rows and performing updates on column values is among the operations where high-level client APIs give you much more powerful functionality than the CLI. However, you can use the CLI to retrieve all columns in a row, specific columns from a row, or a list of all rows in a column family.

To get all columns for a row key, as we have demonstrated in some of the above examples:

[default@twissandra] get users['jsmith'];
=> (column=password, value=6368406e67656d33, timestamp=1295635612024000)
=> (column=session_token, value=74656e, timestamp=1295898172256000, ttl=864000)

To get a specific column or columns for a row, you can specify the column in the get command. Note the use of as to retrieve a human-readable value:

[default@twissandra] get users['jsmith']['session_token'] as ascii;
=> (column=session_token, value=ten, timestamp=1295898172256000, ttl=864000)

You can retrieve all rows in a column family using the list command, optionally controlling the number of records retrieved by specifying a limit value as shown:

[default@twissandra] list users limit 5;
-------------------
RowKey: kbrown
=> (column=password, value=ch@ngem3, timestamp=1296243429992000)
-------------------
RowKey: jbellis
=> (column=birth_date, value=1975, timestamp=1291333936242000)
=> (column=password, value=ch@ngem3, timestamp=1296243370505000)
=> (column=state, value=TX, timestamp=1291334909266000)
-------------------
RowKey: jsmith
=> (column=birth_date, value=1973, timestamp=1296677866680000)
=> (column=password, value=ch@ngem3, timestamp=1296243370505000)
=> (column=session_token, value=74656e, timestamp=1295898172256000, ttl=864000)
=> (column=state, value=UT, timestamp=1296677364573000)
-------------------
RowKey: jbrown
=> (column=password, value=ch@ngem3, timestamp=1296243409963000)
=> (column=session_token, value=ten, timestamp=1296505684508000, ttl=864000)
-------------------
RowKey: msmith
=> (column=password, value=ch@ngem3, timestamp=1296243420785000)

5 Rows Returned.

Deleting Rows

The Cassandra CLI provides the del command to delete a row, column or subcolumn. In this example we will delete user jbrown’s session token column, and then delete jbrown’s row entirely.

[default@twissandra] del users['jbrown']['session_token'];
column removed.
[default@twissandra] get users ['jbrown'];
=> (column=password, value=6368406e67656d33, timestamp=1296243409963000)
Returned 1 results.
[default@twissandra] del users ['jbrown'];
row removed.
[default@twissandra] get users ['jbrown'];
Returned 0 results.

Note, however, that a phenomenon called “range ghosts” in Cassandra may mean that keys for deleted rows are still retrieved by list commands or other get operations. Deleted values, including range ghosts, are removed completely by the first compaction that runs once the tombstone grace period following the deletion has passed.
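
A toy model may make range ghosts easier to picture. In the sketch below, a delete only marks columns with a tombstone; the row key stays visible to a listing until a compaction step physically drops it. RangeGhostSketch is a hypothetical in-memory class, it ignores the grace period, and it does not reflect Cassandra’s actual storage format.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical toy store: a null column value stands for a tombstone.
final class RangeGhostSketch {

    private final Map<String, Map<String, String>> rows = new TreeMap<>();

    void set(String key, String column, String value) {
        rows.computeIfAbsent(key, k -> new HashMap<>()).put(column, value);
    }

    /** Deleting a row only marks its columns as tombstones; nothing is removed yet. */
    void deleteRow(String key) {
        Map<String, String> columns = rows.get(key);
        if (columns != null) {
            for (Map.Entry<String, String> e : columns.entrySet()) {
                e.setValue(null);
            }
        }
    }

    /** Returns only the live (non-tombstoned) columns of a row. */
    Map<String, String> get(String key) {
        Map<String, String> live = new HashMap<>();
        Map<String, String> columns = rows.get(key);
        if (columns != null) {
            for (Map.Entry<String, String> e : columns.entrySet()) {
                if (e.getValue() != null) {
                    live.put(e.getKey(), e.getValue());
                }
            }
        }
        return live;
    }

    /** Lists every key still present, including "ghosts" that have no live columns. */
    List<String> listKeys() {
        return new ArrayList<>(rows.keySet());
    }

    /** Compaction drops tombstones and then any row that has no columns left. */
    void compact() {
        Iterator<Map<String, String>> it = rows.values().iterator();
        while (it.hasNext()) {
            Map<String, String> columns = it.next();
            columns.values().removeIf(v -> v == null);
            if (columns.isEmpty()) {
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        RangeGhostSketch store = new RangeGhostSketch();
        store.set("jbrown", "password", "ch@ngem3");
        store.deleteRow("jbrown");
        System.out.println(store.get("jbrown")); // prints {}
        System.out.println(store.listKeys());    // prints [jbrown]
        store.compact();
        System.out.println(store.listKeys());    // prints []
    }
}
```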

Dropping Column Families and Keyspaces

With Cassandra CLI commands you can drop column families and keyspaces in much the same way that tables and databases are dropped in relational models. This example shows the commands to drop our example users column family and then drop the twissandra keyspace altogether:

[default@twissandra] drop column family users;
ade3bc44-236f-11e0-8410-56547f39a44b
[default@twissandra] drop keyspace twissandra;
30448a50-28d8-11e0-9c0d-e700f669bcfc


Insert data into a Cassandra database using Java


import org.apache.cassandra.thrift.*;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.protocol.TProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

import java.nio.ByteBuffer;

/**
 * Cassandra insert example.
 * Assumes that the keyspace "tutorials" and the column family "User" already exist.
 */
public class InsertExample
{
    public static void main(String[] args) throws Exception {

        // open a framed Thrift connection to the local node
        TTransport transport = new TFramedTransport(new TSocket("localhost", 9160));
        TProtocol protocol = new TBinaryProtocol(transport);
        Cassandra.Client client = new Cassandra.Client(protocol);
        transport.open();

        try {
            client.set_keyspace("tutorials");

            // define column parent (the column family to write to)
            ColumnParent parent = new ColumnParent("User");

            // define row id
            ByteBuffer rowid = ByteBuffer.wrap("100".getBytes("UTF-8"));

            // define column to add
            Column description = new Column();
            description.setName("description".getBytes("UTF-8"));
            description.setValue("I'm a nice guy".getBytes("UTF-8"));
            // microseconds, to match the timestamps written by the CLI
            description.setTimestamp(System.currentTimeMillis() * 1000);

            // define consistency level
            ConsistencyLevel consistencyLevel = ConsistencyLevel.ONE;

            // execute insert
            client.insert(rowid, parent, description, consistencyLevel);
        } finally {
            // release resources
            transport.flush();
            transport.close();
        }
    }
}


Retrieve data from a Cassandra database using Java


import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.ColumnOrSuperColumn;
import org.apache.cassandra.thrift.ColumnParent;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.cassandra.thrift.SlicePredicate;
import org.apache.cassandra.thrift.SliceRange;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.protocol.TProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

import java.nio.ByteBuffer;
import java.util.List;

/**
 * Cassandra retrieve example.
 * Assumes that the keyspace "tutorials" and the column family "User" already exist.
 */
public class RetrieveExample
{
    public static void main(String[] args) throws Exception {

        // open a framed Thrift connection to the local node
        TTransport transport = new TFramedTransport(new TSocket("localhost", 9160));
        TProtocol protocol = new TBinaryProtocol(transport);
        Cassandra.Client client = new Cassandra.Client(protocol);
        transport.open();

        try {
            client.set_keyspace("tutorials");

            // define column parent (the column family to read from)
            ColumnParent parent = new ColumnParent("User");

            // define row id
            ByteBuffer rowid = ByteBuffer.wrap("100".getBytes("UTF-8"));

            // define consistency level
            ConsistencyLevel consistencyLevel = ConsistencyLevel.ONE;

            // an empty start and finish select every column in the row
            SlicePredicate predicate = new SlicePredicate();
            SliceRange sliceRange = new SliceRange();
            sliceRange.setStart(new byte[0]);
            sliceRange.setFinish(new byte[0]);
            predicate.setSlice_range(sliceRange);

            // execute retrieve
            List<ColumnOrSuperColumn> results =
                    client.get_slice(rowid, parent, predicate, consistencyLevel);

            // decode each column name and value as UTF-8 text
            for (ColumnOrSuperColumn cosc : results) {
                System.out.println("column name: " + new String(cosc.column.getName(), "UTF-8"));
                System.out.println("column value: " + new String(cosc.column.getValue(), "UTF-8"));
                System.out.println("column timestamp: " + cosc.column.getTimestamp());
            }
        } finally {
            // release resources
            transport.flush();
            transport.close();
        }
    }
}


More specifically, the new features and improvements in Cassandra 2.0 include:

  • Lightweight transactions allow ensuring operation linearizability similar to the serializable isolation level offered by relational databases which prevents conflicts during concurrent requests
  • Triggers which enable pushing performance-critical code close to the data it deals with, and simplify integration with event-driven frameworks like Storm
  • CQL (Cassandra Query Language) enhancements such as cursors and improved index support
  • Improved compaction, keeping read performance from deteriorating under heavy write load
  • Eager retries to avoid query timeouts by sending redundant requests to other replicas if too much time elapses on the original request
  • Custom Thrift server implementation based on LMAX Disruptor, a high-performance inter-thread messaging library that achieves lower message processing latencies and better throughput with flexible buffer allocation strategies
